Allow apply_chat_template to pass kwargs to the template and support a dict of templates#29658
Allow apply_chat_template to pass kwargs to the template and support a dict of templates#29658Rocketknight1 merged 11 commits intomainfrom
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
| tokenizer_kwargs: Optional[Dict[str, Any]] = None, | ||
| **kwargs, |
There was a problem hiding this comment.
Is this a breaking change? 👀
Previously passed "tokenizer kwargs" will now be passed to "kwargs"
| template_dict = self.chat_template or self.default_chat_template | ||
| if chat_template is not None and chat_template in template_dict: | ||
| # The user can pass the name of a template to the chat template argument instead of an entire template | ||
| chat_template = template_dict[chat_template] |
There was a problem hiding this comment.
At this point, isn't chat_template the actual chat template that's passed as input?
chat_template (str, *optional*): A Jinja template to use for this conversion.
It's being used as the key in the template_dict, but what if the user had passed a chat_template to be used as a template?
There was a problem hiding this comment.
It seems like if self.chat_template is a dict, when passing in chat_template it is always considered a key when it could be a template. If self.chat_template isn't a dict and self.default_chat_template isn't either, then chat_template is now considered a chat template
There was a problem hiding this comment.
It's a bit hacky, but chat_template is only used as a key if it exists as a key in the template_dict. If the user passes an actual Jinja template, that almost certainly will not exist as a key, and so chat_template is treated as a template string.
LysandreJik
left a comment
There was a problem hiding this comment.
Awesome, this looks good to me! Thanks @Rocketknight1
This'll end up changing some templates on the Hub so let's let others that depend on it know.
As the title suggests, this is a simple PR that allows two new features:
kwargscan be passed throughapply_chat_templateto the template renderer.apply_chat_template(this is used by the new Command-R model)This PR is very slightly breaking (we used to pass kwargs to the tokenizer), but I don't think this was commonly used, and I think the new usage is more intuitive. Users can still pass kwargs through the method to the tokenizer with the
tokenizer_kwargsargument.Other than that, it should have no effect on existing users/templates!